@jan-service-account
Updates dev branch with latest release (b6829) from ggml-org/llama.cpp

mmichel11 and others added 7 commits October 23, 2025 09:05
…ng (ggml-org#16644)

* sycl: use async memory allocation to fix graph recording failures

GGML_SYCL_DISABLE_GRAPHS=0 causes crashes because:
  - Host waits are currently unsupported in graph recording mode.
  - SYCL malloc / free calls are unsupported in graph recording mode.

The following changes are made to fix SYCL graph functionality:
  - When graphs are enabled, use the SYCL async memory extension for temp
    buffers, since it is supported with SYCL graphs.
  - For compiler versions that do not support this extension, skip
    graphs for the affected op.
  - Switch from USM shared to device memory, as the async extension
    currently supports only device allocations.
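
The path selection described in the list above can be sketched as a small decision function. This is only an illustrative model, not the code from the commit: `TempAllocPath`, `BackendCaps`, and `choose_temp_alloc_path` are hypothetical names, the sketch makes no SYCL calls, and it assumes the op in question needs a temp buffer.

```cpp
// Hypothetical model of the allocation-path decision; it does not call SYCL.
enum class TempAllocPath {
    SyncDeviceAlloc,   // graphs disabled: ordinary malloc/free with host waits
    AsyncDeviceAlloc,  // graphs enabled and the async memory extension is available
    SkipGraph          // graphs enabled, no async extension: run the op outside a graph
};

struct BackendCaps {
    bool graphs_enabled;         // assumed to mirror GGML_SYCL_DISABLE_GRAPHS=0
    bool async_alloc_supported;  // assumed: the compiler provides the async extension
};

// Chooses how an op that needs a temp buffer should obtain it.
constexpr TempAllocPath choose_temp_alloc_path(BackendCaps caps) {
    if (!caps.graphs_enabled) {
        // No graph recording, so blocking allocations are safe.
        return TempAllocPath::SyncDeviceAlloc;
    }
    if (caps.async_alloc_supported) {
        // Async device allocations can be recorded into a graph.
        return TempAllocPath::AsyncDeviceAlloc;
    }
    // Blocking malloc/free would break recording, so skip the graph for this op.
    return TempAllocPath::SkipGraph;
}

// Compile-time checks of the three cases.
static_assert(choose_temp_alloc_path({false, false}) == TempAllocPath::SyncDeviceAlloc, "");
static_assert(choose_temp_alloc_path({true, true}) == TempAllocPath::AsyncDeviceAlloc, "");
static_assert(choose_temp_alloc_path({true, false}) == TempAllocPath::SkipGraph, "");
```

All three outcomes allocate device memory rather than USM shared memory, which matches the last point in the list.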

* Address reviewer feedback

* Use global async variable to decide path in sycl_ext_[malloc_device|free]

* mtmd-cli : allow using --jinja

* support -sys

* implement chat_history

* fix clear memory

* rm -sys support, added TODO

* Make mistral-common dependency optional

* Fix typing
@jan-service-account jan-service-account merged commit ab72358 into dev Oct 24, 2025
3 checks passed
@jan-service-account jan-service-account deleted the update-dev-from-master-2025-10-24-00-31 branch October 24, 2025 00:33
9 participants